Novel estrogen receptor

This invention relates to the field of receptors belonging
to the superfamily of nuclear hormone receptors, in particular
to steroid receptors. The invention relates to DNA encoding a
novel steroid receptor, the preparation of said receptor, the
receptor protein, and the uses thereof.

Steroid receptors belong to a superfamily of nuclear
hormone receptors involved in ligand-dependent transcriptional
control of gene expression. In addition, this superfamily
includes receptors for non-steroid hormones such as
vitamin D, thyroid hormones and retinoids (Giguere et al,
Nature 330, 624-629, 1987; Evans, R.M., Science 240, 889-895,
1988). Moreover, a range of nuclear receptor-like
sequences have been identified which encode so-called 'orphan'
receptors: these receptors are structurally related to and
therefore classified as nuclear receptors, although no
putative ligands have been identified yet (B.W. O'Malley,
Endocrinology 125, 1119-1170, 1989; D.J. Mangelsdorf and R.M.
Evans, Cell 83, 841-850, 1995).

The members of the superfamily of nuclear hormone receptors
share a modular structure in which six distinct structural and
functional domains, A to F, are displayed (Evans, Science 240,
889-895, 1988). A nuclear hormone receptor is characterized by
a variable N-terminal region (domain A/B), followed by a
centrally located, highly conserved DNA-binding domain
(hereinafter referred to as DBD; domain C), a variable hinge
region (domain D), a conserved ligand-binding domain
(hereinafter referred to as LBD; domain E) and a variable
C-terminal region (domain F).
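
The domain layout described above can be written down as a small
data structure. The following Python sketch is illustrative only:
the names Domain, NUCLEAR_RECEPTOR_DOMAINS and conserved_domains
are hypothetical and introduced here, and the conserved/variable
flags simply restate the wording of the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    label: str       # domain letter(s) as used in the text
    name: str        # descriptive name
    conserved: bool  # True if the text calls the domain conserved


# Order follows the N-terminal to C-terminal layout given in the text.
NUCLEAR_RECEPTOR_DOMAINS = (
    Domain("A/B", "N-terminal region", False),
    Domain("C", "DNA-binding domain (DBD)", True),
    Domain("D", "hinge region", False),
    Domain("E", "ligand-binding domain (LBD)", True),
    Domain("F", "C-terminal region", False),
)


def conserved_domains(domains=NUCLEAR_RECEPTOR_DOMAINS):
    """Return the labels of the domains flagged as conserved."""
    return [d.label for d in domains if d.conserved]
```

Calling conserved_domains() on this table returns the labels of the
DBD and LBD, the two domains on which the homology comparisons in
this specification are based.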

The N-terminal region, which is highly variable in size and
sequence, is poorly conserved among the different members of
the superfamily. This part of the receptor is involved in the
modulation of transcription activation (Bocquel et al, Nucl.
Acid Res., 17, 2581-2595, 1989; Tora et al, Cell 59, 477-487,
1989).

The DBD consists of approximately 66 to 70 amino acids and
is responsible for DNA-binding activity: it targets the
receptor to specific DNA sequences called hormone responsive
elements (hereinafter referred to as HRE) within the
transcription control unit of specific target genes on the
chromatin (Martinez and Wahli, in 'Nuclear Hormone Receptors',
Acad. Press, 125-153, 1991).

The LBD is located in the C-terminal part of the receptor
and is primarily responsible for ligand binding activity. In
this way, the LBD is essential for recognition and binding of
the hormone ligand and, in addition, possesses a transcription
activation function, thereby determining the specificity and
selectivity of the hormone response of the receptor. Although
moderately conserved in structure, the LBDs are known to vary
considerably in homology between the individual members of the
nuclear hormone receptor superfamily (Evans, Science 240,
889-895, 1988; P.J. Fuller, FASEB J., 5, 3092-3099, 1991;
Mangelsdorf et al, Cell, Vol. 83, 835-839, 1995).

Functions present in the N-terminal region, LBD and DBD
operate independently from each other and it has been shown
that these domains can be exchanged between nuclear receptors
(Green et al, Nature, Vol. 325, 75-78, 1987). This results in
chimeric nuclear receptors, such as described for instance in
WO-A-8905355.






When a hormone ligand for a nuclear receptor enters the
cell by diffusion and is recognized by the LBD, it will bind
to the specific receptor protein, thereby initiating an
allosteric alteration of the receptor protein. As a result of
this alteration the ligand/receptor complex switches to a
transcriptionally active state and as such is able to bind
through the presence of the DBD with high affinity to the
corresponding HRE on the chromatin DNA (Martinez and Wahli,
'Nuclear Hormone Receptors', 125-153, Acad. Press, 1991). In
this way the ligand/receptor complex modulates expression of
the specific target genes. The diversity achieved by this
family of receptors results from their ability to respond to
different ligands.



The steroid receptors are a distinct class of the nuclear
receptor superfamily, characterized in that the putative
ligands are steroid hormones. The receptors for
glucocorticoids (GR), mineralocorticoids (MR), progesterone
(PR), androgens (AR) and estrogens (ER) are classical steroid
receptors. Furthermore, the steroid receptors have the unique
ability upon activation to bind to palindromic DNA sequences,
the so-called HREs, as homodimers. The GR, MR, PR and AR
recognize the same DNA sequence, while the ER recognizes a
different DNA sequence (Beato et al, Cell, Vol. 83, 851-857,
1995). After binding to DNA, the steroid receptor is thought
to interact with components of the basal transcriptional
machinery and with sequence-specific transcription factors,
thus modulating the expression of specific target genes.

Several HREs have been identified, which are responsive to
the hormone/receptor complex. These HREs are situated in the
transcriptional control units of the various target genes such
as mammalian growth hormone genes (responsive to
Glucocorticoid, Estrogen, Testosterone), mammalian prolactin
genes and progesterone receptor genes (responsive to
Estrogen), avian ovalbumin genes (responsive to Progesterone),
mammalian metallothionein gene (responsive to Glucocorticoid)
and mammalian hepatic α2u-globulin gene (responsive to
Estrogen, Testosterone, Glucocorticoid).



The steroid receptors are known to be involved in
embryonic development, adult homeostasis as well as organ
physiology. Various diseases and abnormalities have been
ascribed to a disturbance in the steroid hormone pathway.
Since the steroid receptors exercise their influence as
hormone-activated transcriptional modulators, it can be
anticipated that mutations and defects in these receptors, as
well as overstimulation or blocking of these receptors, might
be the underlying reason for the altered pattern. A better
knowledge of these receptors, their mechanism of action and of
the ligands which bind to said receptors might help to create a
better insight into the underlying mechanism of the hormone
pathway, which eventually will lead to better treatment of the
diseases and abnormalities linked to altered hormone/receptor
functioning.

For this reason cDNAs of the steroid and several other
nuclear receptors of several mammals, including humans,
have been isolated and the corresponding amino acid sequences
have been deduced, such as for example the human steroid
receptors PR, ER, GR, MR, and AR, and the human non-steroid
receptors for vitamin D, thyroid hormones, and retinoids such
as retinol A and retinoic acid. In addition, cDNAs of well over
100 mammalian orphan receptors have been isolated, for which
no putative ligands are known yet (Mangelsdorf et al, Cell,
Vol. 83, 835-839, 1995). However, there is still a great need
for the elucidation of other nuclear receptors, in order to
unravel the various roles these receptors play in normal
physiology and pathology.
The present invention provides for such a novel nuclear
receptor. More specifically, the present invention provides for
novel steroid receptors, having estrogen mediated activity.
Said novel steroid receptors are novel estrogen receptors,
which are able to bind and be activated by, for example,
estradiol, estrone and estriol.

According to the present invention it has been found that a
novel estrogen receptor is expressed as an 8 kb transcript in
human thymus, spleen, peripheral blood lymphocytes (PBLs),
ovary and testis. Furthermore, additional transcripts have
been identified. In testis, an additional transcript of 1.3 kb
was detected. Another transcript of approximately 10 kb was
identified in ovary, thymus and spleen. These two transcripts
are probably generated by alternative splicing of the gene
encoding the novel estrogen receptor according to the
invention.

Cloning of the cDNAs encoding the novel estrogen receptors
according to the invention revealed that several splicing
variants of said receptor can be distinguished. At the protein
level, these variants differ only at the C-terminal part.

It is true that an estrogen receptor is already known: cDNA
encoding the classical ER was isolated (Green et al, Nature
320, 134-139, 1986; Greene et al, Science 231, 1150-1154,
1986), and its amino acid sequence deduced. Although both ERs
share a great deal of homology, the amino acid sequences of
both receptors vary considerably. The homology between the
classical ER and the novel ERs according to the invention
resides predominantly in the DBDs and LBDs of said
receptors. Thus, the two receptors are distinct members of the
subclass of estrogen receptors, encoded by different genes.

Furthermore, two orphan receptors, ERRα and ERRβ, having an
estrogen receptor related structure have been described.
Based on the structural relatedness of ERRα and ERRβ with the
classic ER, these orphans are considered to be members of the
estrogen receptor subclass. These receptors, however, have not
been reported to be able to bind estradiol or any other
hormone that binds to the classic ER, and other ligands which
bind to these receptors have not been found yet. The novel
estrogen receptor according to the invention distinguishes
itself clearly from these receptors since it was found to bind
estrogens.

The fact that a novel ER according to the invention has
been found is all the more surprising, since any suggestion
towards the existence of additional estrogen receptors was
absent in the scientific literature: neither the isolation of
the classical ER nor the orphan receptors ERRα and ERRβ
suggested or hinted towards the presence of additional
estrogen receptors such as the receptors according to the
invention. The identification of additional ERs could be a
major step forward for the existing clinical therapies, which
are based on the presence of one ER and as such ascribe all
estrogen mediated abnormalities and/or diseases to this one
receptor. The presence of additional estrogen receptors, such
as the receptors according to the invention, will be useful in
the development of hormone analogs that selectively activate
either the classic ER or the novel estrogen receptor according
to the invention. This should be considered as one of the
major advantages of the present invention.

Thus, in one aspect, the present invention provides for
isolated cDNA encoding a novel steroid receptor. In
particular, the present invention provides for isolated cDNA
encoding a novel estrogen receptor.

According to this aspect of the present invention, there is
provided an isolated DNA encoding a steroid receptor protein
having an N-terminal domain, a DNA-binding domain and a
ligand-binding domain, wherein the amino acid sequence of said
DNA-binding domain of said receptor protein exhibits at least
80% homology with the amino acid sequence shown in SEQ ID
NO:3, and the amino acid sequence of said ligand-binding
domain of said receptor protein exhibits at least 70% homology
with the amino acid sequence shown in SEQ ID NO:4.

In particular, the isolated DNA encodes a steroid receptor
protein having an N-terminal domain, a DNA-binding domain and
a ligand-binding domain, wherein the amino acid sequence of
said DNA-binding domain of said receptor protein exhibits at
least 90%, preferably 95%, more preferably 98%, most
preferably 100% homology with the amino acid sequence shown in
SEQ ID NO:3.

More particularly, the isolated DNA encodes a steroid
receptor protein having an N-terminal domain, a DNA-binding
domain and a ligand-binding domain, wherein the amino acid
sequence of said ligand-binding domain of said receptor
protein exhibits at least 75%, preferably 80%, more preferably
90%, most preferably 100% homology with the amino acid
sequence shown in SEQ ID NO:4.

A preferred isolated DNA according to the invention encodes
a steroid receptor protein having the amino acid sequence
shown in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:21 or SEQ ID
NO:25.

A more preferred isolated DNA according to the invention is
an isolated DNA comprising a nucleotide sequence shown in SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:20 or SEQ ID NO:24.

The DNA according to the invention may be obtained from
cDNA. Alternatively, the coding sequence might be genomic DNA,
or prepared using DNA synthesis techniques.

The DNA according to the invention will be very useful for 
in vivo expression of the novel receptor proteins according to 
the invention in sufficient quantities and in substantially 
pure form. 
In another aspect of the invention, there is provided a
steroid receptor comprising the amino acid sequence encoded by
the above described DNA molecules.

The steroid receptor according to the invention has an
N-terminal domain, a DNA-binding domain and a ligand-binding
domain, wherein the amino acid sequence of said DNA-binding
domain of said receptor exhibits at least 80% homology with
the amino acid sequence shown in SEQ ID NO:3, and the amino
acid sequence of said ligand-binding domain of said receptor
exhibits at least 70% homology with the amino acid sequence
shown in SEQ ID NO:4.

In particular, the steroid receptor according to the
invention has an N-terminal domain, a DNA-binding domain and a
ligand-binding domain, wherein the amino acid sequence of
said DNA-binding domain of said receptor exhibits at least
90%, preferably 95%, more preferably 98%, most preferably 100%
homology with the amino acid sequence shown in SEQ ID NO:3.

More particularly, the steroid receptor according to the
invention has an N-terminal domain, a DNA-binding domain and a
ligand-binding domain, wherein the amino acid sequence of
said ligand-binding domain of said receptor exhibits at least
75%, preferably 80%, more preferably 90%, most preferably 100%
homology with the amino acid sequence shown in SEQ ID NO:4.

It will be clear to those skilled in the art that steroid
receptor proteins comprising combined DBD and LBD
preferences, and DNA encoding such receptors, are also subject
of the invention.

Preferably, the steroid receptor according to the invention
comprises an amino acid sequence shown in SEQ ID NO:5, SEQ ID
NO:6, SEQ ID NO:21 or SEQ ID NO:25.

Also within the scope of the present invention are steroid
receptor proteins which comprise variations in the amino acid
sequence of the DBD and LBD without losing their respective
DNA-binding or ligand-binding activities. The variations that
can occur in those amino acid sequences comprise deletions,
substitutions, insertions, inversions or additions of (an)
amino acid(s) in said sequence, said variations resulting in
amino acid difference(s) in the overall sequence. It is well
known in the art of proteins and peptides that these amino
acid differences lead to amino acid sequences that are
different from, but still homologous with, the native amino
acid sequence they have been derived from.

Amino acid substitutions that are expected not to
essentially alter biological and immunological activities
have been described in, for example, Dayhoff, M.O., Atlas of
protein sequence and structure, Nat. Biomed. Res. Found.,
Washington D.C., 1978, vol. 5, suppl. 3. Amino acid
replacements between related amino acids or replacements which
have occurred frequently in evolution are, inter alia, Ser/Ala,
Ser/Gly, Asp/Gly, Asp/Asn, Ile/Val. Based on this information
Lipman and Pearson developed a method for rapid and sensitive
protein comparison (Science 227, 1435-1441, 1985) and for
determining the functional similarity between homologous
polypeptides.

Variations in amino acid sequence of the DBD according to
the invention resulting in an amino acid sequence that has at
least 80% homology with the sequence of SEQ ID NO:3 will lead
to receptors still having sufficient DNA binding activity.
Variations in amino acid sequence of the LBD according to the
invention resulting in an amino acid sequence that has at
least 70% homology with the sequence of SEQ ID NO:4 will lead
to receptors still having sufficient ligand binding activity.

Homology as defined herein is expressed in percentages, 
determined via PCGENE. 
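
As an illustration of how a homology percentage can be compared
against the thresholds given above, the following Python sketch
computes a simple percent identity over two sequences that have
already been aligned. It is not the PCGENE method, whose scoring is
not described here; the function names, the gap symbol "-" and the
choice to ignore gapped positions are assumptions made for this
example only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over the gap-free aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # assumption: gapped positions are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_thresholds(dbd_pct: float, lbd_pct: float,
                     dbd_min: float = 80.0, lbd_min: float = 70.0) -> bool:
    """Check DBD and LBD percentages against the stated minimum values."""
    return dbd_pct >= dbd_min and lbd_pct >= lbd_min
```

For example, meets_thresholds(85.0, 72.0) is true, whereas
meets_thresholds(85.0, 60.0) is false because the LBD value falls
below 70%. A different alignment or scoring scheme will generally
give different percentages.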
Comparing the amino acid sequence of the classic ER and the
ERs according to the invention revealed a high degree of
similarity within their respective DBDs. The conservation of
the P-box (amino acids E-G-X-X-A), which is responsible for the
actual interactions of ERα with the target DNA element
(Zilliacus et al., Mol. Endo. 9, 389, 1995; Glass, End. Rev. 15,
391, 1994), is indicative of a recognition of estrogen
responsive elements (EREs) by the ERs according to the
invention. Therefore, the classical ER and novel ERs
according to the invention may have overlapping target gene
specificities. This could indicate that in tissues which
co-express both respective ERs, these receptors compete for
EREs. The ERs according to the invention may regulate
transcription of target genes differently from classical ER
regulation or could simply block classical ER functioning by
occupying estrogen responsive elements.

Thus, a preferred steroid receptor according to the
invention comprises the amino acid sequence E-G-X-X-A within
the P-box of the DNA binding domain, wherein X stands for any
amino acid. Also within the scope of the invention is isolated
DNA encoding such a receptor.
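
The P-box pattern E-G-X-X-A, with X standing for any amino acid, can
be searched for in a one-letter protein sequence with a regular
expression. The Python sketch below is a minimal illustration; the
names P_BOX and find_p_box are hypothetical, and the strings in the
usage note are short made-up fragments rather than sequences taken
from the sequence listing.

```python
import re

# E, G, any two residues, then A (one-letter amino acid codes).
P_BOX = re.compile(r"EG[A-Z]{2}A")


def find_p_box(sequence: str) -> list:
    """Return 0-based start positions of non-overlapping E-G-X-X-A matches."""
    return [m.start() for m in P_BOX.finditer(sequence.upper())]
```

With this helper, find_p_box("CEGCKAFF") reports a match starting at
position 1, while find_p_box("CGSCKVFF") reports none. A match only
shows that the five-residue pattern is present; it does not by itself
establish that the residues lie within a DNA-binding domain.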

Methods to prepare the receptors according to the invention
are well known in the art (Sambrook et al., Molecular Cloning:
a Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, latest edition). The most practical approach is
to produce these receptors by expression of the DNA encoding
the desired protein.

A wide variety of host cell and cloning vehicle
combinations may be usefully employed in cloning the nucleic
acid sequence coding for the receptor of the invention. For
example, useful cloning vehicles may include chromosomal,
non-chromosomal and synthetic DNA sequences such as various
known bacterial plasmids and wider host range plasmids and
vectors derived from combinations of plasmids and phage or
virus DNA. Useful hosts may include bacterial hosts, yeasts and
other fungi, plant or animal hosts, such as Chinese Hamster
Ovary (CHO) cells or monkey cells and other hosts.

Vehicles for use in expression of the ligand-binding domain
of the present invention will further comprise control
sequences operably linked to the nucleic acid sequence coding
for the ligand-binding domain. Such control sequences
generally comprise a promoter sequence and sequences which
regulate and/or enhance expression levels. Furthermore, an
origin of replication and/or a dominant selection marker are
often present in such vehicles. Of course, control and other
sequences can vary depending on the host cell selected.

Techniques for transforming or transfecting host cells are
well known in the art (see, for instance, Maniatis et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, 1982 and 1989).

Recombinant expression vectors comprising the DNA of the
invention as well as cells transformed with said DNA or said
expression vector also form part of the present invention.



In a further aspect of the invention, there is provided
a chimeric receptor protein having an N-terminal domain, a
DNA-binding domain, and a ligand-binding domain, characterized
in that at least one of the domains originates from a receptor
protein according to the invention, and at least one of the
other domains of said chimeric protein originates from another
receptor protein from the nuclear receptor superfamily,
provided that the DNA-binding domain and the ligand-binding
domain of said chimeric receptor protein originate from
different proteins.
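
The three conditions in this definition can be expressed as a simple
check on where each domain originates. The Python sketch below is
illustrative only; the class ChimericReceptor, the function
is_valid_chimera and the label "novel ER" for a receptor according
to the invention are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChimericReceptor:
    n_terminal_source: str  # receptor the N-terminal domain comes from
    dbd_source: str         # receptor the DNA-binding domain comes from
    lbd_source: str         # receptor the ligand-binding domain comes from


def is_valid_chimera(c: ChimericReceptor, invention: str = "novel ER") -> bool:
    """Apply the three conditions stated in the text to the domain sources."""
    sources = (c.n_terminal_source, c.dbd_source, c.lbd_source)
    has_invention_domain = invention in sources
    has_other_domain = any(s != invention for s in sources)
    dbd_lbd_differ = c.dbd_source != c.lbd_source
    return has_invention_domain and has_other_domain and dbd_lbd_differ
```

For instance, ChimericReceptor("PR", "PR", "novel ER") satisfies the
check, whereas ChimericReceptor("PR", "novel ER", "novel ER") does
not, because its DBD and LBD originate from the same protein.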
In particular, the chimeric receptor according to the
invention comprises the LBD according to the invention, said
LBD having an amino acid sequence which exhibits at least 70%
homology with the amino acid sequence shown in SEQ ID NO:4. In
that case the N-terminal domain and DBD should be derived from
another nuclear receptor, such as for example PR. In this way
a chimeric receptor is constructed which is activated by a
ligand of the ER according to the invention and which targets
a gene under control of a progesterone responsive element. The
chimeric receptors having an LBD according to the invention are
useful for the screening of compounds to identify novel
ligands or hormone analogs which are able to activate an ER
according to the invention.

In addition, chimeric receptors comprising a DBD according
to the invention, said DBD having an amino acid sequence
exhibiting at least 80% homology with the amino acid sequence
shown in SEQ ID NO:3, and an LBD and, optionally, an N-terminal
domain derived from another nuclear receptor, can be
successfully used to identify novel ligands or hormone analogs
for said nuclear receptors. Such chimeric receptors are
especially useful for the identification of the respective
ligands of orphan receptors.

Since steroid receptors have three domains with different
functions, which are more or less independent, it is possible
for all three functional domains to be derived from
different members of the steroid receptor superfamily.

Molecules which contain parts having a different origin are
called chimeric. Such a chimeric receptor comprising the
ligand-binding domain and/or the DNA-binding domain of the
invention may be produced by chemical linkage, but most
preferably the coupling is accomplished at the DNA level with
standard molecular biological methods by fusing the nucleic
acid sequences encoding the necessary steroid receptor
domains. Hence, DNA encoding the chimeric receptor proteins
according to the invention is also subject of the present
invention.

Such chimeric proteins can be prepared by transfecting DNA
encoding these chimeric receptor proteins to suitable host
cells and culturing these cells under suitable conditions.

It is extremely practical if, next to the information for
the expression of the steroid receptor, the host cell is also
transformed or transfected with a vector which carries the
information for a reporter molecule. Such a vector coding for
a reporter molecule is characterized by having a promoter
sequence containing one or more hormone responsive elements
(HRE) functionally linked to an operative reporter gene. Such
an HRE is the DNA target of the activated steroid receptor and,
as a consequence, it enhances the transcription of the DNA
coding for the reporter molecule. In in vivo settings of
steroid receptors the reporter molecule comprises the cellular
response to the stimulation of the ligand. However, it is
possible in vitro to combine the ligand-binding domain of a
receptor with the DNA binding domain and transcription
activating domain of other steroid receptors, thereby enabling
the use of other HRE and reporter molecule systems. One such
system is established by an HRE present in the MMTV-LTR
(mouse mammary tumor virus long terminal repeat sequence) in
connection with a reporter molecule like the firefly
luciferase gene or the bacterial gene for CAT (chloramphenicol
acetyltransferase). Other HREs which can be used are the rat
oxytocin promoter, the retinoic acid responsive element, the
thyroid hormone responsive element and the estrogen responsive
element; synthetic responsive elements have also been
described (for instance in Fuller, ibid, page 3096). As
reporter molecules, next to CAT and luciferase, β-galactosidase
can be used.
Steroid receptors and chimeric receptors according to the
present invention can be used for the in vitro identification
of novel ligands or hormonal analogs. For this purpose binding
studies can be performed with cells transformed with DNA
according to the invention or an expression vector comprising
DNA according to the invention, said cells expressing the
steroid receptors or chimeric receptors according to the
invention.



The novel steroid receptor and chimeric receptors according
to the invention, as well as the ligand-binding domain of the
invention, can be used in an assay for the identification of
functional ligands or hormone analogs for the nuclear
receptors.

Thus, the present invention provides for a method for
identifying functional ligands for the steroid receptors and
chimeric receptors according to the invention, said method
comprising the steps of
a) introducing into a suitable host cell 1) DNA or an
   expression vector according to the invention, and 2)
   a suitable reporter gene functionally linked to an
   operative hormone response element, said HRE being
   able to be activated by the DNA-binding domain of
   the receptor protein encoded by said DNA;
b) bringing the host cell from step a) into contact
   with potential ligands which will possibly bind to
   the ligand-binding domain of the receptor protein
   encoded by said DNA from step a);
c) monitoring the expression of the reporter protein
   encoded by said reporter gene of step a).
If expression of the reporter gene is induced with respect
to basic expression (without ligand), the functional ligand
can be considered as an agonist; if expression of the reporter
gene remains unchanged or is reduced with respect to basic
expression, the functional ligand can be a suitable (partial)
antagonist.
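
The read-out of such an assay can be summarised as a fold change
relative to basic expression. The following Python sketch shows one
way the comparison described above could be coded; the function name
classify_ligand, the tolerance parameter (used to decide when
expression counts as induced) and the returned labels are
assumptions made for this example and are not part of the method as
described.

```python
def classify_ligand(basal: float, with_ligand: float,
                    tolerance: float = 0.1):
    """Return (fold change, label) for a reporter read-out.

    basal       -- reporter signal without ligand (basic expression)
    with_ligand -- reporter signal in the presence of the test compound
    tolerance   -- hypothetical margin above 1.0 that counts as induction
    """
    if basal <= 0:
        raise ValueError("basal expression must be positive")
    fold = with_ligand / basal
    if fold > 1.0 + tolerance:
        return fold, "agonist"
    # Unchanged or reduced expression: a possible (partial) antagonist.
    return fold, "candidate (partial) antagonist"
```

A compound giving a reporter signal of 350 against a basal signal of
100 would be returned as (3.5, "agonist"). In practice the margin
would have to be derived from the variability of replicate
measurements.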

For performing this kind of investigation, host cells which
have been transformed or transfected with both a vector
encoding a functional steroid receptor and a vector having the
information for a hormone responsive element and a connected
reporter molecule are cultured in a suitable medium. After
addition of a suitable ligand, which will activate the
receptor, the production of the reporter molecule will be
enhanced, which production can simply be determined by assays
having a sensitivity for the reporter molecule. See for
instance WO-A-8803168. Assays with known steroid receptors
have been described (for instance S. Tsai et al., Cell 57,
443, 1989; M. Meyer et al., Cell 57, 433, 1989).



Legends to the figures 
Figure 1. 

Northern analysis of the novel estrogen receptor. Two 
different multiple tissue Northern blots (Clontech) were 
hybridised with a specific probe for the novel estrogen 
receptor (see examples). Indicated are the human tissues the 
RNA originated from and the position of the size markers in 
kilobases (kb) . 

Figure 2. 

Histogram showing the 3- to 4-fold stimulatory effect of
17β-estradiol, estriol and estrone on the luciferase activity
mediated by the novel estrogen receptor. An expression vector
encoding the novel estrogen receptor was transiently
transfected into CHO cells together with a reporter construct
containing the rat oxytocin promoter in front of the firefly
luciferase encoding sequence (see examples).



Examples 



A. Molecular cloning of the novel estrogen receptor.

Two degenerate oligonucleotides containing inosines (I)
were based on conserved regions of the DNA-binding domains and
the ligand-binding domains of the human steroid hormone
receptors.

Primer #1:
5'-GGIGA(C/T)GA(A/G)GC(A/T)TCIGGTTG(C/T)CA(C/T)TA(C/T)TA(C/T)GG-3'
(SEQ ID NO:7).
Primer #2:
5'-AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT-3' (SEQ
ID NO:8).
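
The primer notation used above, with alternatives in parentheses and
I for inosine, can be parsed to count the positions and the number
of distinct sequences in the degenerate pool. The Python sketch
below is illustrative; the names TOKEN, parse_primer and degeneracy
are hypothetical, inosine is counted as a single position without
alternatives, and PRIMER_2 is transcribed from primer #2 as printed
above with the 5' and 3' labels and spacing removed.

```python
import re

# Either a parenthesised set of alternatives such as (C/T),
# or a single base (A, C, G, T) or inosine (I).
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTI])")

PRIMER_2 = "AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT"


def parse_primer(primer: str) -> list:
    """Split a primer into positions; each position is a tuple of options."""
    primer = re.sub(r"\s+", "", primer)  # tolerate stray whitespace
    positions = []
    pos = 0
    while pos < len(primer):
        m = TOKEN.match(primer, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.group(1):
            positions.append(tuple(m.group(1).split("/")))
        else:
            positions.append((m.group(2),))
        pos = m.end()
    return positions


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by the primer notation."""
    count = 1
    for options in parse_primer(primer):
        count *= len(options)
    return count
```

Applied to PRIMER_2, parse_primer yields 29 positions and degeneracy
returns 32, reflecting five two-fold degenerate positions.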

As template, cDNA from human EBV-stimulated PBLs
(peripheral blood leukocytes) was used. One microgram of total
RNA was reverse transcribed in a 20 µl reaction containing 50
mM KCl, 10 mM Tris-HCl pH 8.3, 4 mM MgCl2, 1 mM dNTPs
(Pharmacia), 100 pmol random hexanucleotides (Pharmacia), 30
Units RNase inhibitor (Pharmacia) and 200 Units M-MLV Reverse
transcriptase (Gibco BRL). Reaction mixtures were incubated at
37°C for 30 minutes and heat-inactivated at 100°C for 5
minutes. The cDNA obtained was used in a 100 µl PCR reaction
containing 10 mM Tris-HCl pH 8.3, 50 mM KCl, 1.5 mM MgCl2,
0.001% gelatin (w/v), 3% DMSO, 1 microgram of primer #1 and
primer #2 and 2.5 Units of Amplitaq DNA polymerase (Perkin
Elmer). PCR reactions were performed in the Perkin Elmer 9600
thermal cycler. The initial denaturation (4 minutes at 94°C)
was followed by 36 cycles with the following conditions: 30
sec. 94°C, 30 sec. 45°C, 1 minute 72°C, and after 7 minutes at
72°C the reactions were stored at 4°C. Aliquots of these
reactions were analysed on a 1.5% agarose gel. Fragments of
interest were cut out of the gel, reamplified using identical
PCR conditions and purified using Qiaex II (Qiagen). Fragments
were cloned in the pCRII vector and transformed into bacteria
using the TA-cloning kit (Invitrogen). Plasmid DNA was
isolated for nucleotide sequence analysis using the Qiagen
plasmid midi protocol (Qiagen). Nucleotide sequence analysis
was performed with the ALF automatic sequencer (Pharmacia)
using a T7 DNA sequencing kit (Pharmacia) with vector-specific
or fragment-specific primers.
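
The thermal cycling programme described above can be written down as
a few constants, which makes it easy to estimate the total hold
time. The Python sketch below is a rough illustration only: the
constant and function names are hypothetical, the values are
transcribed from the preceding paragraph, and ramp times between
temperatures and the final storage at 4°C are not included.

```python
# Hold times in seconds, transcribed from the cycling conditions above.
INITIAL_DENATURATION_S = 4 * 60   # 4 minutes at 94 °C
CYCLES = 36
CYCLE_STEPS_S = (30, 30, 60)      # 94 °C, 45 °C and 72 °C steps
FINAL_EXTENSION_S = 7 * 60        # 7 minutes at 72 °C


def total_hold_seconds() -> int:
    """Sum of all programmed hold times, ignoring temperature ramps."""
    return (INITIAL_DENATURATION_S
            + CYCLES * sum(CYCLE_STEPS_S)
            + FINAL_EXTENSION_S)


def total_hold_minutes() -> float:
    """Total programmed hold time expressed in minutes."""
    return total_hold_seconds() / 60
```

With these values the programmed hold time comes to 4980 seconds, or
83 minutes; the actual run time on a given instrument will be longer
because of ramping.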

One cloned fragment corresponded to a novel estrogen
receptor (ER) which is closely related to the classical
estrogen receptor. Part of the cloned novel estrogen receptor
fragment (nucleotides 466 to 797 in SEQ ID NO:1) was amplified
by PCR using oligonucleotide #3 TGTTACGAAGTGGGAATGGTGA (SEQ ID
NO:9) and oligonucleotide #2 and used as a probe to screen a
human testis cDNA library in λgt11 (Clontech #HL1010b).
Recombinant phages were plated (using Y1090 bacteria grown in
LB medium supplemented with 0.2% maltose) at a density of
40,000 per 135 mm dish and replica filters (Hybond-N,
Amersham) were made as described by the supplier. Filters were
prehybridised in a solution containing 0.5 M phosphate buffer
(pH 7.5) and 7% SDS at 65°C for at least 30 minutes. DNA
probes were purified with Qiaex II (Qiagen), 32P-labeled with
a Decaprime kit (Ambion) and added to the prehybridisation
solution. Filters were hybridised at 65°C overnight and then
washed in 0.5 x SSC/0.1% SDS at 65°C. Two positive plaques
were identified and could be shown to be identical. These
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clones were purified by re screening one more tine. A PCR 
reaction cn the phage eluates with the ^gtll-specif ic primers 
#4: 5 f -TTGACACCAGACCAACTGGTAATG-3 ' (SEQ ID NO: 10) and. #5: 5'- 
GGTGGCGACGACTCCTGGAGCCCG-3' (SEQ ID NO : 11) yielded a fragment 
5 of 1700 basepairs on both clones. Subsequent PCR reactions 

using combinations of a gene-specific primer #6: 5'- 

GTACACTGATTTGTAGCTGGAC-3' (SEQ ID NO; 12) with the X.gtll primer 
#4 and gene-specific primer #7: 5' -CCATGATGATGTCCCTGACC-3 ' 
(SEQ ID NO: 13) with Xgtll primer primer #5 yielded fragmerts 

10 of 450 bp and 1000 bp, respectively, which were cloned in the 

PCRII vector and used for nucleotide sequence analysis. The 
conditions for these PCR reactions were as described above 
except for the primer concentrations (200 ng of each primer) 
and the annealing temperature (60°C) ► Since in the cDNA cl :>ne 

is the homology with the ER is lost abruptly ar a site which 

corresponds to the exon 7/exon 8 boundary in the ER, it was 
suggested that this sequence corresponds to intron 7 of the 
novel ER gene. For verification of the nucleotide sequences of 
this cDNA clone, a 1200 bp fragment was generated on the cDNA 

20 clone with Xgtll primer #4 with a gene-specific primer #8 

corresponding to the 3 r end of exon 7: 5'~ 

TCGCATGCCTGACGTGGGAC- 3 r (SEQ ID NO: 14) using the proofreading 
Pfu polymerase (Stratagene) . This fragment was also cloned in 
the PCRII vector and completely sequenced and was shown to be 
2 identical r.o the sequences obtained earlier. 
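As a rough plausibility check on the 60°C annealing temperature, the melting temperature of the gene-specific primers can be estimated. The sketch below uses the Wallace rule (2°C per A/T, 4°C per G/C), a rule of thumb for short oligonucleotides; this rule is an assumption introduced here for illustration and is not stated to be the method by which the annealing temperature was chosen.

```python
def wallace_tm(primer: str) -> int:
    """Estimate the melting temperature (in °C) with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); only meaningful for short,
    unambiguous oligonucleotides.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


PRIMER_6 = "GTACACTGATTTGTAGCTGGAC"  # SEQ ID NO: 12
PRIMER_7 = "CCATGATGATGTCCCTGACC"    # SEQ ID NO: 13

for name, primer in (("#6", PRIMER_6), ("#7", PRIMER_7)):
    print(name, len(primer), "nt, estimated Tm", wallace_tm(primer), "°C")
```

Both estimates fall a few degrees above the annealing temperature used, which is the usual relation between the two values.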

To obtain nucleotide sequences of the novel ER downstream
of exon 7, a degenerate oligonucleotide based on the AF-2
region of the classic ER (#9:
5'-GGC(C/G)TCCAGCATCTCCAG(C/G)A(A/G)CAG-3'; SEQ ID NO: 15) was
used together with the gene-specific oligonucleotide #10:
5'-GGAAGCTGGCTCACTTGCTG-3' (SEQ ID NO: 16) using testis cDNA as
template (Marathon ready testis cDNA, Clontech Cat #7414-1). A
specific 220 bp fragment corresponding to nucleotides 1112 to
1332 in SEQ ID NO: 1 was cloned and sequenced and was shown to
contain high homology with the corresponding region in the
classic ER. In order to obtain sequences of the novel ER
downstream of the AF-2 region, RACE (rapid amplification of
cDNA ends) PCR reactions were performed using the Marathon-
ready testis cDNA (Clontech) as template. The initial PCR was
performed using oligonucleotide #11: 5'-TCTTGTTCTGGACAGGGATG-3'
(SEQ ID NO: 17) in combination with the AP1 primer provided
in the kit. A nested PCR was performed on an aliquot of this
reaction using oligonucleotide #10 in combination with the
oligo dT primer provided in the kit. Subsequently, an aliquot
of this reaction was used in a nested PCR using
oligonucleotide #12: 5'-GCATGGAACATCTGCTCAAC-3' (SEQ ID NO: 18)
in combination with the oligo dT primer. Nucleotide sequence
analysis of a specific fragment that was obtained
(corresponding to nucleotides 1256 to 1431 in SEQ ID NO: 1)
revealed a sequence encoding the carboxyterminus of the novel
ER ligand-binding domain, including an F-domain and a
translational stopcodon.
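The degenerate oligonucleotide #9 represents a small pool of defined sequences. As an illustration only, the sketch below expands its IUPAC notation (SEQ ID NO: 15 in the listing, where S stands for C/G and R for A/G) into the individual members of the pool; the function name is hypothetical, and inosine, as used in SEQ ID NO: 7 and 8, is deliberately not handled.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (inosine "I" is not included).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate oligo."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combination) for combination in product(*choices)]


OLIGO_9 = "GGCSTCCAGCATCTCCAGSARCAG"  # SEQ ID NO: 15

pool = expand_degenerate(OLIGO_9)
print(len(pool), "sequences in the pool")
for member in pool:
    print(member)
```

With three two-fold degenerate positions the pool contains eight 24-mers.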

In order to investigate the possibility that the novel
estrogen receptor had additional, upstream translation-
initiation codons, RACE-PCR experiments were performed using
Marathon-ready testis cDNA (Clontech Cat. # 7414-1). First a
PCR was performed using oligonucleotide SEQ ID NO: 26
(antisense corresponding to nucleotides 416-396 in SEQ ID
NO: 1) and AP-1 (provided in the kit). A nested PCR was then
performed using oligonucleotide having SEQ ID NO: 27 (antisense
corresponding to nucleotides 254-241 in SEQ ID NO: 1) with AP-2
(provided in the kit). From the smear that was obtained, the
region corresponding to fragments larger than 300 basepairs
was cut out, purified using the GenecleanII kit (Bio101) and
cloned using the TA-cloning kit (Clontech). Colonies were
screened by PCR using gene-specific primers (SEQ ID NO: 22 and
SEQ ID NO: 28, antisense corresponding to nucleotides 124-103
in SEQ ID NO: 1). The clone containing the largest insert was
sequenced. The nucleotide sequence corresponds to nucleotides
1 to 480 in SEQ ID NO: 24. It is clear from this sequence that
an in-frame upstream translation initiation codon is present
(position 77-79 in SEQ ID NO: 24). Upstream of this
translational startcodon an in-frame stop-codon is present
(11-13 in SEQ ID NO: 24). Consequently, the reading frame of
the novel estrogen receptor is 530 amino acids, and the
receptor has a calculated molecular mass of approximately 59
kD.
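The relation between the length of the reading frame and the calculated molecular mass can be approximated as follows. This is a back-of-envelope sketch: the average residue mass of 110 Da is a common rule of thumb assumed here, not the calculation actually used, and the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumption: typical average for proteins


def orf_length_nt(n_residues: int) -> int:
    """Length of an open reading frame in nucleotides, stop codon included."""
    return 3 * (n_residues + 1)


def approx_mass_kd(n_residues: int) -> float:
    """Approximate molecular mass in kD from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


print(orf_length_nt(530), "nt reading frame for 530 amino acids")
print(round(approx_mass_kd(530), 1), "kD estimated for 530 amino acids")
print(orf_length_nt(477), "nt for the 477 amino acids of SEQ ID NO: 5")
```

The estimate for 530 residues lies close to the stated value of approximately 59 kD, and the 477-residue protein of SEQ ID NO: 5 corresponds to the 1434 base pairs of SEQ ID NO: 1.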

To confirm the nucleotide sequences obtained with 5' RACE,
human genomic clones were isolated and analysed. A human
genomic library in λEMBL3 (Clontech HL1067J) was screened with
a probe corresponding to nucleotides 1 to 416 in SEQ ID NO: 1.
A strongly hybridizing clone was plaque-purified and DNA was
isolated using standard protocols (Sambrook et al, 1989). The
DNA was digested with several restriction enzymes,
electrophoresed on agarose gel and blotted onto Nylon filters.
Hybridisation of the blot with a probe corresponding to the
above-mentioned RACE fragment (nucleotides 1-480 in SEQ ID
NO: 24) revealed a hybridizing Sau3A fragment of approximately
SCO basepairs. This fragment was cloned into the BamHI site of
pGEM3z and sequenced. The nucleotide sequence was identical to
the sequence of the 5' RACE fragments except for one base
difference which is probably a PCR-induced point mutation.
Nucleotide 172 was a G residue in the 5' RACE fragment but an A
residue in several independent genomic subclones.

B. Identification of two splice variants of the novel
estrogen receptor.

Rescreening of the testis cDNA library with a probe
corresponding to nucleotides 917 to 1248 in SEQ ID NO: 1
yielded two hybridizing clones, the 3' end of which were
amplified by PCR (gene-specific primer #8:
5'-GGAAGCTGGCTCACTTGCTG-3' together with primer #4), cloned and
sequenced. One clone was shown to contain an alternative exon
8 (exon 8B) of the novel ER. As a consequence of the
introduction of this exon through a specific alternative
splicing reaction, the reading frame encoding the novel ER is
immediately terminated, thereby creating a truncation of the
carboxyterminus of the novel ER.

Screening of a human thymus cDNA library (Clontech HL1074a)
with the probe corresponding to nucleotides 935 to 1266 in SEQ
ID NO: 1 revealed another splice variant. The 3' end of one
hybridizing clone was amplified using primer #8 with the
λgt10-specific primer #13 5'-AGCAAGTTCAGCCTGTTAAGT-3' (SEQ ID
NO: 19), cloned in the PCRII vector and sequenced. The obtained
nucleotide sequence upstream of the exon 7/exon 8 boundary
was identical to the clones identified earlier. However, an
alternative exon 8 (exon 8C) was present at the 3' end
encoding two C-terminal amino acids followed by a stop-codon.
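The effect of the alternative 3' ends on the reading frame can be illustrated by translating the last codons of the three cDNA sequences in the listing. The sketch below is illustrative only: the translate helper is hypothetical, and each input starts in frame at nucleotide 1240 (codon CAT, His 414) of SEQ ID NO: 1, 2 and 20 respectively.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate an in-frame cDNA into one-letter code up to the first stop."""
    cdna = cdna.upper()
    peptide = []
    for position in range(0, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[position:position + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# In-frame 3' ends copied from the sequence listing, from nucleotide 1240 on.
SEQ1_FROM_1240 = "CATGCGAGTAACAAGGGCATG"  # SEQ ID NO: 1, nt 1240-1260
SEQ2_FROM_1240 = "CATGCGAGGTGA"           # SEQ ID NO: 2, nt 1240-1251
SEQ20_FROM_1240 = "CATGCGAGGTCTGCCTGA"    # SEQ ID NO: 20, nt 1240-1257

for label, cdna in (
    ("SEQ ID NO: 1", SEQ1_FROM_1240),
    ("SEQ ID NO: 2", SEQ2_FROM_1240),
    ("SEQ ID NO: 20", SEQ20_FROM_1240),
):
    print(label, translate(cdna))
```

The output agrees with the carboxyterminal residues of SEQ ID NO: 5, 6 and 21: the sequence of SEQ ID NO: 1 reads through into the ligand-binding domain, whereas the other two terminate after one and three further codons.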

These two variants of the novel estrogen receptor do not
contain the AF-2 region and therefore probably lack the
ability to modulate transcription of target genes in a ligand-
dependent fashion. However, the variants potentially could
interfere with the functioning of the wild-type classic ER
and/or the wild-type novel ER, either by heterodimerization or
by occupying estrogen response elements. A mutant of the
classic ER (ER1-530) has been described which closely
resembles the two variants of the novel estrogen receptor
described above. ER1-530 has been shown to behave as a
dominant-negative receptor, i.e. it can block the intracellular
activity of the wild type ER (Ince et al, J. Biol. Chem. 268,
14026-14032, 1993).

C. Northern blot analysis.

Human multiple tissue Northern blots (MTN-blots) were
purchased from Clontech and prehybridized for at least 1 hour
at 65°C in 0.5 M phosphate buffer pH 7.5 with 7% SDS. DNA
fragments that were used as probes were 32P-labeled using a
labelling kit (Ambion), denatured by boiling and added to the
prehybridisation solution. Washing conditions were: 3 X SSC at
room temperature, followed by 3 X SSC at 65°C, and finally 1 X
SSC at 65°C. The filters were then exposed to X-ray films for
one week. Two transcripts of approximately 8 kb and 10 kb were
detected in thymus, spleen, ovary and testis. In addition, a
1.3 kb transcript is detected in testis.

D. Ligand-dependent transcription activation by the novel
estrogen receptor protein.

Cell culture 

Chinese Hamster Ovary (CHO K1) cells were obtained from
ATCC (CCL61) and maintained at 37°C in a humidified atmosphere
(5% CO2) as a monolayer culture in phenol red-free M505 medium.
The latter medium consists of a mixture (1:1) of Dulbecco's
Modified Eagle's Medium (DMEM, Gibco 074-200) and Nutrient
Medium F12 (Ham's F12, Gibco 074-1700) supplemented with 2.5
mg/ml sodium carbonate (Baker), 55 µg/ml sodium pyruvate
(Fluka), 2.3 µg/ml β-mercaptoethanol (Baker), 1.2 µg/ml
ethanolamine (Baker), 360 µg/ml L-glutamine (Merck), 0.45
µg/ml sodium selenite (Fluka), 62.5 µg/ml penicillin
(Mycopharm), 62.5 µg/ml streptomycin (Serva), and 5% charcoal-
treated bovine calf serum (Hyclone).

Recombinant vectors 

The novel ER encoding sequence as presented in SEQ ID NO: 1
was amplified by PCR using oligonucleotides
5'-CTTGGATCCATAGCCCTGCTGTGATGAATTACAG-3' (SEQ ID NO: 22)
(underlined is the translation initiation codon) and
5'-GATGGATCCTCACCTCAGGGCCAGGCGTCACTG-3' (SEQ ID NO: 23)
(underlined is the translation stopcodon, antisense). The
resulting BamHI fragment (approximately 1450 base pairs) was
then cloned in the mammalian cell expression vector pNGV1
behind the SV40 early promoter. In addition, this vector
contains the IgG and MuLV enhancers.

The reporter expression vector was based on the rat
oxytocin gene regulatory region (position -363/+16 as a
HindIII/MboI fragment; R. Ivell and D. Richter,
Proc. Natl. Acad. Sci. USA 81, 2006-2010, 1984) linked to the
firefly luciferase encoding sequence; the regulatory region of
the oxytocin gene was shown to possess functional estrogen
hormone response elements in vitro for both the rat (R. Adan,
N. Walther, J. Cox, R. Ivell, and P. Burbach,
Biochem. Biophys. Res. Comm. 175, 117-122, 1991) and the human
(S. Richard and H. Zingg, J. Biol. Chem. 265, 6098-6103, 1990).

Transient transfection

1 x 10* CHO cells were seeded in 6-wells Nunclon tissue
culture plates and DNA was introduced by use of lipofectin
(Gibco BRL). Hereto, the DNA (1 µg of both receptor and
reporter vector in 250 µL Optimem, Gibco BRL) was mixed with
an equal volume of lipofectin reagent (7 µL in 250 µL Optimem,
Gibco) and allowed to stand at room temperature for 15 min.
After washing the cells twice with serum-free medium (M505)
new medium (500 µL Optimem, Gibco) was added to the cells
followed by the dropwise addition of the DNA-lipofectin
mixture. After incubation for a 5 hour period at 37°C cells
were washed twice with phenol red-free M505 + 5% charcoal-
treated bovine calf serum and incubated overnight at 37°C.
After 24 hours hormone (17β-estradiol, estriol or estrone) was
added to the medium (100 nmol/L). Cell extracts were made 48
hours posttransfection by the addition of 200 µL lysis buffer
(0.1 M phosphate buffer pH 7.8, 0.2% Triton X-100). After
incubation for 5 min at 37°C the cell suspension was
centrifuged (Eppendorf centrifuge, 5 min) and 20 µL sample was
added to 50 µL luciferase assay reagent (Promega). Light
emission was measured in a luminometer (Berthold Biolumat) for
10 sec at 562 nm.
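Luminometer readings of this kind are usually compared as the ratio between hormone-treated and untreated cells. The sketch below shows one way such a fold induction could be computed; the function name is hypothetical and the readings are invented placeholder values for illustration, not measured data.

```python
from statistics import mean


def fold_induction(treated: list[float], untreated: list[float]) -> float:
    """Ratio of mean light emission in treated wells over untreated wells."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated baseline must be positive")
    return mean(treated) / baseline


# Invented example readings (arbitrary light units), one value per well.
untreated_wells = [100.0, 90.0, 110.0]
estradiol_wells = [300.0, 340.0, 320.0]

print(round(fold_induction(estradiol_wells, untreated_wells), 2))
```

Replicate wells are averaged before the ratio is taken, so that a single outlying well has limited influence on the reported fold increase.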

Results


CHO cells transiently transfected with the novel ER
expression vector and a reporter plasmid showed a 3 to 4 fold
increase in luciferase activity in response to 17-beta-
estradiol as compared to untreated cells. A similar
transactivation was obtained upon treatment with estriol and
estrone. The results indicate not only that the novel ER can
bind estrogen hormones but also that the ligand-activated
receptor can bind to the ERE within the rat oxytocin promoter
and activate transcription of the luciferase reporter gene.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Akzo Nobel N.V.
(B) STREET: Velperweg 76
(C) CITY: Arnhem
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): 6624 BM
(G) TELEPHONE: 0412-666379
(H) TELEFAX: 0412-650592
(I) TELEX: 37503 akpha nl

(ii) TITLE OF INVENTION: Novel estrogen receptor

(iii) NUMBER OF SEQUENCES: 26

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1434 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC 60
ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC 120
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA 180
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC 240
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC 300
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT 360
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC 420
GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA 480
ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA 540
AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC 600
CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG 660
GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC 720
ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC 780
AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT GTTGGAGAGC 840
TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG 900
CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT 960
CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA 1020
CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG 1080
GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG 1140
ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG 1200
CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGTAA CAAGGGCATG 1260
GAACATCTGC TCAACATGAA GTGCAAAAAT GTGGTCCCAG TGTATGACCT GCTGCTGGAG 1320
ATGCTGAATG CCCACGTGCT TCGCGGGTGC AAGTCCTCCA TCACGGGGTC CGAGTGCAGC 1380
CCGGCAGAGG ACAGTAAAAG CAAAGAGGGC TCCCAGAACC CACAGTCTCA GTGA 1434

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC 60
ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC 120
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA 180
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC 240
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC 300
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT 360
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC 420
GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA 480
ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA 540
AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC 600
CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG 660
GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC 720
ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC 780
AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC 840
TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG 900
CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT 960
CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA 1020
CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG 1080
GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG 1140
ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG 1200
CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGGTG A 1251

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 66 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp
1 5 10 15

Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His
20 25 30

Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn
35 40 45

Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val
50 55 60

Gly Met
65



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 233 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Val Leu Thr Leu Leu Glu Ala Glu Pro Pro His Val Leu Ile Ser
1 5 10 15

Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser Met Met Met Ser Leu Thr
20 25 30

Lys Leu Ala Asp Lys Glu Leu Val His Met Ile Ser Trp Ala Lys Lys
35 40 45

Ile Pro Gly Phe Val Glu Leu Ser Leu Phe Asp Gln Val Arg Leu Leu
50 55 60

Glu Ser Cys Trp Met Glu Val Leu Met Met Gly Leu Met Trp Arg Ser
65 70 75 80

Ile Asp His Pro Gly Lys Leu Ile Phe Ala Pro Asp Leu Val Leu Asp
85 90 95

Arg Asp Glu Gly Lys Cys Val Glu Gly Ile Leu Glu Ile Phe Asp Met
100 105 110

Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu Leu Lys Leu Gln His Lys
115 120 125

Glu Tyr Leu Cys Val Lys Ala Met Ile Leu Leu Asn Ser Ser Met Tyr
130 135 140

Pro Leu Val Thr Ala Thr Gln Asp Ala Asp Ser Ser Arg Lys Leu Ala
145 150 155 160

His Leu Leu Asn Ala Val Thr Asp Ala Leu Val Trp Val Ile Ala Lys
165 170 175

Ser Gly Ile Ser Ser Gln Gln Gln Ser Met Arg Leu Ala Asn Leu Leu
180 185 190

Met Leu Leu Ser His Val Arg His Ala Ser Asn Lys Gly Met Glu His
195 200 205

Leu Leu Asn Met Lys Cys Lys Asn Val Val Pro Val Tyr Asp Leu Leu
210 215 220

Leu Glu Met Leu Asn Ala His Val Leu
225 230



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Ser
405 410 415

Asn Lys Gly Met Glu His Leu Leu Asn Met Lys Cys Lys Asn Val Val
420 425 430

Pro Val Tyr Asp Leu Leu Leu Glu Met Leu Asn Ala His Val Leu Arg
435 440 445

Gly Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys Ser Pro Ala Glu Asp
450 455 460

Ser Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln Ser Gln
465 470 475

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 416 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Arg
405 410 415

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGIGAYGARG CWTCIGGITG YCAYTAYGG 29

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

AAGCCTGGSA YICKYTTIGC CCAIYTIAT 29

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TGTTACGAAG TGGGAATGGT GA 22



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTGACACCAG ACCAACTGGT AATG 24

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GGTGGCGACG ACTCCTGGAG CCCG 24



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GTACACTGAT TTGTAGCTGG AC 22



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CCATGATGAT GTCCCTGACC 20



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TCGCATGCCT GACGTGGGAC 20



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GGCSTCCAGC ATCTCCAGSA RCAG 24



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GGAAGCTGGC TCACTTGCTG 20



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TCTTGTTCTG GACAGGGATG 20

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GCATGGAACA TCTGCTCAAC 20



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AGCAAGTTCA GCCTGTTAAG T 21



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1257 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC 60
ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC 120
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA 180
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC 240
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC 300
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT 360
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC 420
GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA 480
ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA 540
AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC 600
CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG 660
GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC 720
ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC 780
AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC 840
TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG 900
CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT 960
CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA 1020
CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG 1080
GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG 1140
ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG 1200
CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGGTC TGCCTGA 1257

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 418 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Arg
405 410 415

Ser Ala

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CTTGGATCCA TAGCCCTGCT GTGATGAATT ACAG 34 
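
The oligonucleotides of SEQ ID NO: 22 and SEQ ID NO: 23 both contain the hexamer GGATCC, the recognition sequence of BamHI, near their 5' ends. The Python sketch below checks this and shows that the reverse complement of SEQ ID NO: 23 matches a stretch of SEQ ID NO: 24 around position 1664. The helper names are illustrative and not part of the specification, and this section does not state why the sites are present, so any cloning purpose is an assumption.

```python
# Illustrative helpers (hypothetical names, not from the specification)
# for inspecting the oligonucleotides of SEQ ID NO: 22 and SEQ ID NO: 23.

BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

SEQ_ID_22 = "CTTGGATCCA TAGCCCTGCT GTGATGAATT ACAG"  # 34 bases
SEQ_ID_23 = "GATGGATCCT CACCTCAGGG CCAGGCGTCA CTG"   # 33 bases

# Positions 1661-1690 of SEQ ID NO: 24, copied from the listing.
SEQ_ID_24_FRAGMENT = "TCTCAGTGAC GCCTGGCCCT GAGGTGAACT"


def normalize(seq: str) -> str:
    """Keep only the bases A, C, G, T; drop spaces and layout digits."""
    return "".join(base for base in seq.upper() if base in "ACGT")


def find_sites(seq: str, site: str = BAMHI_SITE) -> list[int]:
    """Return the 0-based start index of every occurrence of `site`."""
    bases = normalize(seq)
    width = len(site)
    return [i for i in range(len(bases) - width + 1) if bases[i:i + width] == site]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(normalize(seq)))


if __name__ == "__main__":
    for name, oligo in (("SEQ ID NO: 22", SEQ_ID_22), ("SEQ ID NO: 23", SEQ_ID_23)):
        print(name, "length", len(normalize(oligo)), "BamHI sites at", find_sites(oligo))
    # The first 24 bases of the reverse complement precede the BamHI tail.
    annealing_part = reverse_complement(SEQ_ID_23)[:24]
    print("anneals to fragment:", annealing_part in normalize(SEQ_ID_24_FRAGMENT))
```

Only the part of each oligonucleotide outside the GGATCC tail is expected to match the template, which is why the comparison uses the first 24 bases of the reverse complement.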
(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GATGGATCCT CACCTCAGGG CCAGGCGTCA CTG 33

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1898 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



CACGAATCTT TGAGAACATT ATAATGACCT TTGTGCCTCT TCTTGCAAGG TGTTTTCTCA 60

GCTGTTATCT CAAGACATGG ATATAAAAAA CTCACCATCT AGCCTTAATT CTCCTTCCTC 120

CTACAACTGC AGTCAATCCA TCTTACCCCT GGAGCACGGC TCCATATACA TACCTTCCTC 180

CTATGTAGAC AGCCACCATG AATATCCAGC CATGACATTC TATAGCCCTG CTGTGATGAA 240

TTACAGCATT CCCAGCAATG TCACTAACTT GGAAGGTGGG CCTGGTCGGC AGACCACAAG 300

CCCAAATGTG TTGTGGCCAA CACCTGGGCA CCTTTCTCCT TTAGTGGTCC ATCGCCAGTT 360

ATCACATCTG TATGCGGAAC CTCAAAAGAG TCCCTGGTGT GAAGCAAGAT CGCTAGAACA 420

CACCTTACCT GTAAACAGAG AGACACTGAA AAGGAAGGTT AGTGGGAACC GTTGCGCCAG 480

CCCTGTTACT GGTCCAGGTT CAAAGAGGGA TGCTCACTTC TGCGCTGTCT GCAGCGATTA 540

CGCATCGGGA TATCACTATG GAGTCTGGTC GTGTGAAGGA TGTAAGGCCT TTTTTAAAAG 600

AAGCATTCAA GGACATAATG ATTATATTTG TCCAGCTACA AATCAGTGTA CAATCGATAA 660

AAACCGGCGC AAGAGCTGCC AGGCCTGCCG ACTTCGGAAG TGTTACGAAG TGGGAATGGT 720

GAAGTGTGGC TCCCGGAGAG AGAGATGTGG GTACCGCCTT GTGCGGAGAC AGAGAAGTGC 780

CGACGAGCAG CTGCACTGTG CCGGCAAGGG CAAGAGAAGT GGCGGCCACG CGCCCCGAGT 840

GCGGGAGCTG CTGCTGGACG CCCTGAGCCC CGAGCAGCTA GTGCTCACCC TCCTGGAGGC 900

TGAGCCGCCC CATGTGCTGA TCAGCCGCCC CAGTGCGCCC TTCACCGAGG CCTCCATGAT 960

GATGTCCCTG ACCAAGTTGG CCGACAAGGA GTTGGTACAC ATGATCAGCT GGGCCAAGAA 1020

GATTCCCGGC TTTGTGGAGC TCAGCCTGTT CGACCAAGTG CGGCTCTTGG AGAGCTGTTG 1080

GATGGAGGTC TTAATGATGG GGCTGATGTG GCGCTCAATT GACCACCCCG GCAAGCTCAT 1140

CTTTGCTCCA GATCTTGTTC TGGACAGGGA TGAGGGGAAA TGCGTAGAAG GAATTCTGGA 1200

AATCTTTGAC ATGCTCCTGG CAACTACTTC AAGGTTTCGA GAGTTAAAAC TCCAACACAA 1260

AGAATATCTC TGTGTCAAGG CCATGATCCT GCTCAATTCC AGTATGTACC CTCTGGTCAC 1320

AGCGACCCAG GATGCTGACA GCAGCCGGAA GCTGGCTCAC TTGCTGAACG CCGTGACCGA 1380

TGCTTTGGTT TGGGTGATTG CCAAGAGCGG CATCTCCTCC CAGCAGCAAT CCATGCGCCT 1440

GGCTAACCTC CTGATGCTCC TGTCCCACGT CAGGCATGCG AGTAACAAGG GCATGGAACA 1500

TCTGCTCAAC ATGAAGTGCA AAAATGTGGT CCCAGTGTAT GACCTGCTGC TGGAGATGCT 1560

GAATGCCCAC GTGCTTCGCG GGTGCAAGTC CTCCATCACG GGGTCCGAGT GCAGCCCGGC 1620

AGAGGACAGT AAAAGCAAAG AGGGCTCCCA GAACCCACAG TCTCAGTGAC GCCTGGCCCT 1680

GAGGTGAACT GGCCCACAGA GGTCACAAGC TGAAGCGTGA ACTCCAGTGT GTCAGGAGCC 1740

TGGGCTTCAT CTTTCTGCTG TGTGGTCCCT CATTTGGTGA TGGCAGGCTT GGTCATGTAC 1800

CATCCTTCCC TCCACCTTCC CAACTCTCAG GAGTCGGTGT GAGGAAGCCA TAGTTTCCCT 1860

TGTTAGCAGA GGGACATTTG AATCGAGCGT TTCCACAC 1898

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 530 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Met Asp Ile Lys Asn Ser Pro Ser Ser Leu Asn Ser Pro Ser Ser Tyr
1 5 10 15

Asn Cys Ser Gln Ser Ile Leu Pro Leu Glu His Gly Ser Ile Tyr Ile
20 25 30

Pro Ser Ser Tyr Val Asp Ser His His Glu Tyr Pro Ala Met Thr Phe
35 40 45

Tyr Ser Pro Ala Val Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn
50 55 60

Leu Glu Gly Gly Pro Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp
65 70 75 80

Pro Thr Pro Gly His Leu Ser Pro Leu Val Val His Arg Gln Leu Ser
85 90 95

His Leu Tyr Ala Glu Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser
100 105 110

Leu Glu His Thr Leu Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val
115 120 125

Ser Gly Asn Arg Cys Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg
130 135 140

Asp Ala His Phe Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His
145 150 155 160

Tyr Gly Val Trp Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser
165 170 175

Ile Gln Gly His Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr
180 185 190

Ile Asp Lys Asn Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys
195 200 205

Cys Tyr Glu Val Gly Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys
210 215 220

Gly Tyr Arg Leu Val Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His
225 230 235 240

Cys Ala Gly Lys Ala Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg
245 250 255

Glu Leu Leu Leu Asp Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu
260 265 270

Leu Glu Ala Glu Pro Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro
275 280 285

Phe Thr Glu Ala Ser Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys
290 295 300

Glu Leu Val His Met Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val
305 310 315 320

Glu Leu Ser Leu Phe Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met
325 330 335

Glu Val Leu Met Met Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly
340 345 350

Lys Leu Ile Phe Ala Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys
355 360 365

Cys Val Glu Gly Ile Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr
370 375 380

Ser Arg Phe Arg Glu Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val
385 390 395 400

Lys Ala Met Ile Leu Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala
405 410 415

Thr Gln Asp Ala Asp Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala
420 425 430

Val Thr Asp Ala Leu Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser
435 440 445

Gln Gln Gln Ser Met Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His
450 455 460

Val Arg His Ala Ser Asn Lys Gly Met Glu His Leu Leu Asn Met Lys
465 470 475 480

Cys Lys Asn Val Val Pro Val Tyr Asp Leu Leu Leu Glu Met Leu Asn
485 490 495

Ala His Val Leu Arg Gly Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys
500 505 510

Ser Pro Ala Glu Asp Ser Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln
515 520 525

Ser Gln

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GTACACTGAT TTGTAGCTGG A 
(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
AGTAACAGGG CTGGCGCAAC GGTTC

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
ACTGGCGATG GACCACTAAA GG 22

Claims:



1. Isolated DNA encoding a protein having an N-terminal
domain, a DNA-binding domain and a ligand-binding domain,
wherein the amino acid sequence of said DNA-binding
domain of said protein exhibits at least 80% homology
with the amino acid sequence shown in SEQ ID NO: 3, and
the amino acid sequence of said ligand-binding domain of
said protein exhibits at least 70% homology with the
amino acid sequence shown in SEQ ID NO: 4.

2. Isolated DNA according to claim 1, characterized in that
the amino acid sequence of said DNA-binding domain of
said protein exhibits at least 90%, preferably 95%, more
preferably 98%, most preferably 100% homology with the
amino acid sequence shown in SEQ ID NO: 3.

3. Isolated DNA according to claim 1 or 2, characterized in
that the amino acid sequence of said ligand-binding
domain of said protein exhibits at least 75%, preferably
80%, more preferably 90%, most preferably 100% homology
with the amino acid sequence shown in SEQ ID NO: 4.
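
The claims above set percentage thresholds on the homology of the DNA-binding and ligand-binding domains. A minimal Python sketch of such a threshold check follows. It assumes that homology is scored as percent identity over two sequences that have already been aligned to equal length; that scoring rule is an assumption, since the claims themselves do not fix the method. The function names, the gap symbol and the example values are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch. This is one common
    convention, chosen here as an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_claim_1(dbd_identity: float, lbd_identity: float) -> bool:
    """Apply the claim 1 thresholds: DBD at least 80%, LBD at least 70%."""
    return dbd_identity >= 80.0 and lbd_identity >= 70.0


if __name__ == "__main__":
    # Toy one-letter fragments, for illustration only.
    score = percent_identity("CEGCKAFFKR", "CEGCKGFFKR")
    print(f"identity: {score:.1f}%")
    print("claim 1 thresholds met:", meets_claim_1(score, 72.0))
```

In practice the two domain sequences would first be aligned with a dedicated alignment tool; the sketch only covers the scoring and threshold step.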

4. Isolated DNA according to claims 1 to 3, said DNA
encoding a protein comprising the amino acid sequence of
SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 21 or SEQ ID NO: 25.

5. Isolated DNA according to claims 1 to 4, characterized in
that said DNA comprises the nucleic acid sequence of SEQ
ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 20 or SEQ ID NO: 24.

6. A recombinant expression vector comprising the DNA
according to any of the claims 1 to 5.

7. A cell transfected with DNA according to claims 1 to 5 or
an expression vector according to claim 6.

8. A cell according to claim 7 which is a stable transfected
cell line which expresses the steroid receptor protein
according to any of the claims 9 to 11.

9. Protein encoded by DNA according to claims 1 to 5 or an
expression vector according to claim 6.

10. Protein according to claim 9, said protein comprising the
amino acid sequence of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID
NO: 21 or SEQ ID NO: 25.

11. Chimeric protein having an N-terminal domain, a DNA-binding
domain, and a ligand-binding domain,
characterized in that at least one of said domains of
said chimeric protein originates from a protein according
to claims 9 or 10, and at least one of the other domains
of said chimeric protein originates from another receptor
protein from the nuclear receptor superfamily, provided
that the DNA-binding domain and the ligand-binding domain
of said chimeric protein originate from different
proteins.

12. DNA encoding a protein according to claim 11.

13. Use of a DNA according to claims 1 to 5 or 12, an
expression vector according to claim 6, a cell according
to claim 7 or 8 or a protein according to claims 9 to 11
in a screening assay for identification of new drugs.

14. A method for identifying functional ligands for the
protein according to claims 9 to 11, said method
comprising the steps of

a) introducing into a suitable host cell 1) DNA
according to claims 1 to 5 or 12, and 2) a suitable
reporter gene functionally linked to an operative
hormone response element, said HRE being able to be
activated by the DNA-binding domain of the protein
encoded by said DNA;
b) bringing the host cell from step a) into contact
with potential ligands which will possibly bind to
the ligand-binding domain of the protein encoded by
said DNA from step a);
c) monitoring the expression of the protein encoded by
said reporter gene of step a).
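
The monitoring step of the method above yields reporter readings that have to be compared against a control. One way this evaluation might be done is sketched below in Python. It assumes a luciferase reporter (Fig. 2 is labelled in luciferase units) and treats a candidate as a hit when its mean signal exceeds the vehicle control by a chosen fold threshold. The threshold, the compound names and the readings are hypothetical and are not taken from the specification.

```python
from statistics import mean


def fold_induction(treated: list[float], vehicle: list[float]) -> float:
    """Mean reporter signal of treated wells divided by the vehicle mean."""
    baseline = mean(vehicle)
    if baseline <= 0:
        raise ValueError("vehicle control signal must be positive")
    return mean(treated) / baseline


def select_hits(
    readings: dict[str, list[float]],
    vehicle: list[float],
    threshold: float = 2.0,  # hypothetical cut-off, not from the specification
) -> list[str]:
    """Return, in sorted order, the compounds whose fold induction meets the threshold."""
    return sorted(
        name
        for name, values in readings.items()
        if fold_induction(values, vehicle) >= threshold
    )


if __name__ == "__main__":
    # Made-up luciferase units, for illustration only.
    vehicle_wells = [100.0, 100.0]
    candidate_wells = {
        "compound_A": [300.0, 500.0],
        "compound_B": [110.0, 90.0],
    }
    for name, values in candidate_wells.items():
        print(name, "fold induction:", fold_induction(values, vehicle_wells))
    print("hits:", select_hits(candidate_wells, vehicle_wells))
```

A fold threshold is only one possible criterion; a statistical test across replicate wells would be an alternative design.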
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ABSTRACT 

The present invention relates to isolated DNA encoding
novel estrogen receptors, the proteins encoded by said DNA,
chimeric receptors comprising parts of said novel receptors
and uses thereof.

Fig. 2 (axis label: luciferase units)